TESTS FOR JOINT NORMALITY IN TIME SERIES


1. Motivation

The well-known methods for analysis of time series, whether in the time
domain or in the frequency domain (for fitting parametric structures, for
regression, for forecasting), all involve second-moment statistics. If all
variables are jointly normally distributed in stationary sequences, simple
first and second moments contain all the information. If not, there is the
possibility that some of the needed information is not contained in the
statistics used. When a random sequence is other than stationary and
jointly normal, it may sometimes equally well be described and thought of
as stationary but not jointly normal (which is the terminology used here)
or as nonstationary.

The topic is most easily illustrated in the context of simple regression.
Suppose that we desire to be able to estimate the unobserved value of y
when x is observed, that x and y are random variables with a joint
distribution, and that we have a large sample of independent (x, y)
observations with which to estimate the relation between y and x. If the
joint distribution of x and y is normal, first and second moments of the
sample are sufficient statistics, the regression curve of y on x (that is,
the conditional expectation of y, given x) is linear and well estimated,
for many purposes, by the usual least-squares regression line. In
particular, the ordinate of the fitted line is the best estimate of y,
given x, in the absence of prior information about the parameters (that
is, for a flat prior) and for any loss function that is not constant and
is a nondecreasing function of the magnitude of the error.

When  the  joint  distribution  for  x  and  y  is  not  normal,  the  usual 
regression  line  may  be  less  satisfactory.  Consider  three  examples. 

(i) If the regression curve of y on x is not linear, fitting a linear
regression relation to some data may give a poor way of estimating y for
given x. Let

$$ y = \mu + \beta |x| + \epsilon , $$

where $\mu$ and $\beta$ are constants, x is distributed N(0, 1) and
$\epsilon$ is distributed independently $N(0, \sigma^2)$, and
$\sigma^2 < \beta^2$. The usual regression line is useless and suggests a
much larger residual variance than the correct $\sigma^2$. Here the
marginal distribution for y is not normal.

(ii) If x and y are marginally normal and are jointly distributed with
constant crossproduct probability ratio different from 1 (Plackett, 1965),
the regression curve of y on x is monotone but not linear. This is a less
dramatic example of the same effect as at (i); the usual regression line
is not useless, but it is not the best predictor.

(iii) Even if the regression curve of y on x is linear, the usual
regression line may give a poor estimate of y, given x, if the loss
function is sufficiently different from squared error. Let x be
distributed N(0, 1) and, given x, let y have probability 1/2 of being
equal to x, and probability 1/2 of being an independent N(0, 1) variable.
The regression curve of y on x is then $y = \tfrac{1}{2} x$. Let the loss
be 0 if $|\text{error}| < \delta$, otherwise 1, where $\delta$ is small.
Then for given x, if say x > 0, a good estimate of y is
$\max[0, x - \delta]$. Here, as for (ii), the marginal distributions for x
and y are normal.
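Example (i) can be checked numerically. The following is a minimal sketch,
not part of the original report: the parameter values, the sample size and
the seed are arbitrary assumptions, and the helper names `fit_line` and
`residual_variance` are introduced only for illustration. Under the stated
model the fitted slope should be near 0 and the residual variance about the
line should be near $\sigma^2 + \beta^2 (1 - 2/\pi)$, much larger than
$\sigma^2$.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope of ys on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def residual_variance(xs, ys, predict):
    """Mean squared residual of ys about predict(x)."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(1)            # arbitrary seed (assumption)
mu, beta, sigma = 0.0, 2.0, 0.5   # illustrative values with sigma^2 < beta^2
n = 20000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [mu + beta * abs(x) + rng.gauss(0.0, sigma) for x in xs]

a, b = fit_line(xs, ys)
var_line = residual_variance(xs, ys, lambda x: a + b * x)
var_curve = residual_variance(xs, ys, lambda x: mu + beta * abs(x))
# Theoretical residual variance about the best straight line (slope 0).
var_line_theory = sigma ** 2 + beta ** 2 * (1.0 - 2.0 / math.pi)

print("slope", b, "var about line", var_line, "var about curve", var_curve)
```

The straight line explains essentially nothing here, while the true
regression curve leaves only the error variance.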

These considerations do not seem very sinister in regard to simple linear
regression, because a scatterplot of the given (x, y) observations would
most likely reveal any such effects. No special machinery seems to be
called for. Similar considerations apply to multiple regression on several
explanatory variables. Obtaining a comprehensive understanding of how the
variables are related through examining scatterplots is less easy, though
still possible. Various tests can also be calculated from the residuals.

Perceiving nonnormality in the joint distribution of a single time series,
or of several related time series, is difficult from such graphical
displays as are commonly made in treating time series. A correlogram or
spectrum (or cross-correlogram or cross-spectrum) does not help much. We
here propose an adaptation to stationary time series of Mardia's test for
kurtosis in a multivariate distribution. In this preliminary report,
suitable test statistics are proposed, some information is given
concerning their distribution under the null hypothesis, with a suggested
computer program for making the tests, and there is a brief consideration
of power. Further study of these matters, and examples of application,
will be presented later.

2. Tests for kurtosis in stationary time series

A given time series $\{x_t\}$, for some consecutive integer values of t, is
supposed to be realized from a stationary sequence of random variables
$\{\xi_t\}$, and we wish to test the hypothesis that the random variables
are jointly normally distributed.

Univariate normality of the marginal distribution for $\xi_t$ could be
tested by making a histogram of the aggregate of given values $\{x_t\}$ or
by calculating (for example) a kurtosis statistic,

$$ b_2 = N \Bigl( \sum_t (x_t - \bar{x})^4 \Bigr) \Big/ \Bigl( \sum_t (x_t - \bar{x})^2 \Bigr)^2 , $$

where N is the number of t-values and $N \bar{x} = \sum_t x_t$. To
determine a significance level for $b_2$, proper account would have to be
taken of the correlation structure of the sequence $\{\xi_t\}$.

Univariate  marginal  normality  of  a  stationary  random  sequence  does  not 
imply  joint  normality,  and  the  latter  is  what  we  are  interested  in  here. 
Mardia (1970) has considered testing joint normality, given n independent
observations  of  a  p-variate  distribution.  His  procedure  involves  linearly 
transforming  the  p-variate  distribution  so  that  it  becomes  spherical,  and 
then  he  considers  the  n  distances  of  the  observations  from  their  center 
of  gravity  and  constructs  a  kurtosis  statistic  by  comparing  the  sum  of  the 
fourth  powers  of  the  distances  with  the  squared  sum  of  the  squared  distances. 
An  analogous  way  to  treat  a  stationary  time  series  would  be  to  express  the 
series  in  terms  of  independent  identically  distributed  "innovations"  and 
then  calculate  kurtosis  statistics  either  from  single  innovations,  or  from 
pairs  of  consecutive  innovations,  or  from  triples  of  consecutive  innovations, 
etc. 


Our suggested procedure is, first of all, to try to represent the sequence
in finite autoregressive form, say

$$ (\xi_t - \mu) - \alpha_1 (\xi_{t-1} - \mu) - \cdots - \alpha_p (\xi_{t-p} - \mu) = \epsilon_t \qquad (t = 0, \pm 1, \pm 2, \ldots) , \qquad (1) $$

where $\mu, \alpha_1, \ldots, \alpha_p$ are constants and $\{\epsilon_t\}$
are independent identically distributed "error" random variables having a
normal distribution with zero mean. For a given positive integer p, such a
finite autoregressive structure can be estimated by performing ordinary
linear regression of $\{x_t\}$ on $\{x_{t-1}\}, \ldots, \{x_{t-p}\}$, where
we may say that $1 \le t \le n$ if the whole given series is of length
n + p, that is, the given series is $\{x_t\}$ for $1 - p \le t \le n$. The
residuals $\{u_t\}$ are the "innovations", estimating the errors
$\{\epsilon_t\}$:

$$ u_t = x_t - a_0 - a_1 x_{t-1} - \cdots - a_p x_{t-p} \qquad (1 \le t \le n) , \qquad (2) $$

where $a_0, a_1, \ldots, a_p$ are the regression coefficients. The
innovations depend on the choice of p. A possible method of choosing p is
to calculate the empirical discrete spectrum of the given series
(prewhitened and tapered), smooth it with a suitable moving average to
form (after adjusting for the prewhitening) an estimated spectral density,
and then try to approximate the reciprocal of the spectral density by a
low-order polynomial in the cosine of the angular frequency; p is taken to
be the degree of the polynomial. We shall suppose that n is much larger
than p. Formation of innovations has been recently discussed by Kleiner,
Martin and Thomson (1979), for a different purpose.
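The regression step that produces the innovations can be sketched as
follows. This is an illustrative implementation under stated assumptions,
not the report's own program: the function names `solve` and `innovations`
are hypothetical, the normal equations are solved by plain Gaussian
elimination (adequate only for small p), and the order p is taken as given
rather than chosen from the spectrum.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    m = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, m))
        x[r] = (aug[r][m] - s) / aug[r][r]
    return x

def innovations(x, p):
    """Regress x_t on (1, x_{t-1}, ..., x_{t-p}); return (coefficients, residuals)."""
    rows = [[1.0] + [x[t - j] for j in range(1, p + 1)] for t in range(p, len(x))]
    ys = x[p:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [y - sum(c * v for c, v in zip(coef, r)) for r, y in zip(rows, ys)]
    return coef, resid

# Usage: a simulated first-order autoregression (coefficient 0.6 is arbitrary).
rng = random.Random(2)
series = [0.0]
for _ in range(3000):
    series.append(0.6 * series[-1] + rng.gauss(0.0, 1.0))
coef, u = innovations(series, 1)
print("a0, a1 =", coef, "number of innovations =", len(u))
```

A series of length n + p yields n innovations, matching the indexing
$1 - p \le t \le n$ used in the text.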

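For a given vector of innovations $\{u_t\}$, a kurtosis statistic can be
defined from single innovations:

$$ b_{21} = n \Bigl( \sum_{t=1}^{n} u_t^4 \Bigr) \Big/ \Bigl( \sum_{t=1}^{n} u_t^2 \Bigr)^2 . \qquad (3) $$

A kurtosis statistic defined from pairs of consecutive innovations is

$$ b_{22} = n \Bigl( \sum_{t=1}^{n-1} (u_t^2 + u_{t+1}^2)^2 \Bigr) \Big/ \Bigl( \sum_{t=1}^{n} u_t^2 \Bigr)^2 , \qquad (4) $$

and one based on triples of consecutive innovations is

$$ b_{23} = n \Bigl( \sum_{t=1}^{n-2} (u_t^2 + u_{t+1}^2 + u_{t+2}^2)^2 \Bigr) \Big/ \Bigl( \sum_{t=1}^{n} u_t^2 \Bigr)^2 , \qquad (5) $$

and so on.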
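The three statistics are simple to compute from a vector of innovations.
The sketch below is one possible transcription of definitions (3), (4) and
(5); the function name `kurtosis_stats` is an assumption made for
illustration.

```python
def kurtosis_stats(u):
    """Return (b21, b22, b23) for a sequence of innovations u, per (3)-(5)."""
    n = len(u)
    s = [v * v for v in u]                # squared innovations
    d = sum(s) ** 2                       # common denominator
    b21 = n * sum(v * v for v in s) / d
    b22 = n * sum((s[t] + s[t + 1]) ** 2 for t in range(n - 1)) / d
    b23 = n * sum((s[t] + s[t + 1] + s[t + 2]) ** 2 for t in range(n - 2)) / d
    return b21, b22, b23

# Usage on a tiny artificial vector in which every squared innovation is 1.
print(kurtosis_stats([1.0, -1.0, 1.0, -1.0]))
```

For that artificial vector the sums can be done by hand: with n = 4 the
denominator is 16, and the numerators are 4, 12 and 18 before
multiplication by n.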

If indeed the sequence $\{\xi_t\}$ is correctly described by an expression
of the form (1), for some finite p, the left side of (1) is a sequence of
independent identically distributed normal variables. Then if the correct
value for p is used, the innovations $\{u_t\}$ will presumably seem to be
realized from nearly independent identically distributed normal variables
and the kurtosis statistics $b_{21}, b_{22}, \ldots$ should behave
accordingly. In particular, if n is large, $b_{21}$ is expected to be near
to $E(\epsilon_t^4)/(E\epsilon_t^2)^2$, which is equal to 3, $b_{22}$ is
expected to be near to
$E((\epsilon_t^2 + \epsilon_{t+1}^2)^2)/(E\epsilon_t^2)^2$, which is equal
to 8, $b_{23}$ is expected to be near to 15, etc.
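The values 3, 8 and 15 follow from a short moment calculation, sketched
here for a block of k consecutive errors (the symbol k is introduced only
for this note). For independent $\epsilon_t$ distributed
$N(0, \sigma^2)$ we have $E(\epsilon_t^4) = 3\sigma^4$ and
$E(\epsilon_s^2 \epsilon_t^2) = \sigma^4$ for $s \ne t$, so that

$$ E\Bigl( \sum_{j=0}^{k-1} \epsilon_{t+j}^2 \Bigr)^2 = k \cdot 3\sigma^4 + k(k-1)\sigma^4 = k(k+2)\,\sigma^4 . $$

Dividing by $(E\epsilon_t^2)^2 = \sigma^4$ gives k(k + 2), that is 3, 8
and 15 for k = 1, 2 and 3.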

If the sequence $\{\xi_t\}$ is jointly normal but, for some p, cannot be
represented in the form (1), either because a larger value for p would be
needed or because the sequence does not have a finite autoregressive
expression, and if for the chosen value of p the parameters
$\mu, \alpha_1, \ldots, \alpha_p$ are chosen to minimize the variance of
the left side of (1), then the left side constitutes a stationary sequence
of normal variables that are not independent. The innovations calculated
for that p will presumably also seem to be realized from correlated normal
variables. Correlation in the innovations may be expected to have less
effect on $b_{21}$ than on $b_{22}, b_{23}, \ldots$

3. Distributions under the null hypothesis

To approximate the distributions of the statistics $b_{21}$, $b_{22}$,
etc. under the null hypothesis of stationarity and joint normality, it is
natural to consider moments. It has been found that the distribution of
the ordinary kurtosis statistic, Pearson's $b_2$ or Fisher's $g_2$, in
samples from a normal population, is fairly well approximated by a linear
function of the reciprocal of a $\chi^2$ variable, having a distribution
of Pearson's Type V, fitted to the first three moments (Anscombe and
Glynn, 1975). Accordingly we seek to determine the first three moments of
the distributions of $b_{21}, b_{22}, \ldots$, in order to be able to make
the Type V approximation.

First suppose that $\{u_t\}$ in the definitions (3), (4) and (5) are not
as specified at (2) but instead are independent N(0, 1) variables. Since
the ratio on the right side of each definition is then independent of its
denominator, relations such as this hold:

$$ E(b_{21}^r)\, E\Bigl( \sum_t u_t^2 \Bigr)^{2r} = n^r\, E\Bigl( \sum_t u_t^4 \Bigr)^r \qquad (r = 1, 2, \ldots) . $$
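As a worked instance of this device, take r = 1 with independent N(0, 1)
variables $u_t$. The sum $\sum_t u_t^2$ has a $\chi^2$ distribution with n
degrees of freedom, whose second moment about zero is n(n + 2), and
$E \sum_t u_t^4 = 3n$. Hence

$$ E(b_{21}) = \frac{n \cdot 3n}{n(n+2)} = \frac{3n}{n+2} . $$

The same argument applied to the n - 1 pairs and n - 2 triples, each with
expected squared sum 8 and 15 respectively, gives
$E(b_{22}) = 8(n-1)/(n+2)$ and $E(b_{23}) = 15(n-2)/(n+2)$.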

The following results may be deduced.

For $n \ge 1$,

$$ E(b_{21}) = \frac{3n}{n+2} , $$

$$ \operatorname{var}(b_{21}) = \frac{24 n^2 (n-1)}{(n+2)^2 (n+4)(n+6)} \sim \frac{24}{n+15} , $$

$$ E(b_{21} - E b_{21})^3 = \frac{1728\, n^3 (n-1)(n-2)}{(n+2)^3 (n+4)(n+6)(n+8)(n+10)} \sim \frac{1728}{n(n+37)} , $$

$$ \gamma_1(b_{21}) \sim \frac{6\sqrt{6}}{\sqrt{n+29}} \approx \frac{14.70}{\sqrt{n+29}} . $$

The asymptotic results after the ~ sign are correct to a factor
$1 + O(n^{-2})$ when n is large. The skewness measure $\gamma_1$ means the
third central moment divided by the standard deviation cubed.
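The claim that the asymptotic forms agree with the exact ones to a factor
$1 + O(n^{-2})$ can be checked numerically. The sketch below evaluates the
exact expressions for $b_{21}$ in rational arithmetic and reports the
relative errors of the asymptotic forms; the function names are
assumptions introduced here.

```python
from fractions import Fraction
import math

def b21_exact_moments(n):
    """Exact mean, variance and third central moment of b21 (iid N(0,1) case)."""
    n = Fraction(n)
    mean = 3 * n / (n + 2)
    var = 24 * n ** 2 * (n - 1) / ((n + 2) ** 2 * (n + 4) * (n + 6))
    mu3 = (1728 * n ** 3 * (n - 1) * (n - 2)
           / ((n + 2) ** 3 * (n + 4) * (n + 6) * (n + 8) * (n + 10)))
    return mean, var, mu3

def relative_errors(n):
    """Relative error of the asymptotic variance, third moment and skewness."""
    _, var, mu3 = (float(v) for v in b21_exact_moments(n))
    gamma1 = mu3 / var ** 1.5
    avar = 24.0 / (n + 15)
    amu3 = 1728.0 / (n * (n + 37))
    agamma1 = 6.0 * math.sqrt(6.0) / math.sqrt(n + 29)
    return avar / var - 1.0, amu3 / mu3 - 1.0, agamma1 / gamma1 - 1.0

for n in (50, 200, 1000):
    print(n, relative_errors(n))
```

Multiplying n by 10 should shrink each relative error by roughly a factor
of 100 if the stated order of approximation holds.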

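Similarly for $n \ge 2$,

$$ E(b_{22}) = \frac{8(n-1)}{n+2} \sim \frac{8n}{n+3} , $$

$$ \operatorname{var}(b_{22}) = \frac{16 (n-2)(7n^2 + 2n + 48)}{(n+2)^2 (n+4)(n+6)} \sim \frac{112}{n + 15\tfrac{5}{7}} , $$

and for $n \ge 3$,

$$ E(b_{22} - E b_{22})^3 = \frac{256 (65 n^5 - 358 n^4 + 996 n^3 - 1928 n^2 + 5152 n - 7680)}{(n+2)^3 (n+4)(n+6)(n+8)(n+10)} \sim \frac{16640}{n (n + 39\tfrac{33}{65})} , $$

$$ \gamma_1(b_{22}) \sim \frac{260}{7 \sqrt{7 (n + 31.9)}} \approx \frac{14.04}{\sqrt{n + 31.9}} . $$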

For $n \ge 3$,

$$ E(b_{23}) = \frac{15(n-2)}{n+2} \sim \frac{15n}{n+4} , $$

and for $n \ge 4$,

$$ \operatorname{var}(b_{23}) = \frac{8 (37 n^3 - 3 n^2 + 296 n - 2700)}{(n+2)^2 (n+4)(n+6)} \sim \frac{296}{n + 14\tfrac{3}{37}} , $$

and for $n \ge 6$,

$$ E(b_{23} - E b_{23})^3 = \frac{64 (1095 n^5 - 2709 n^4 + 9414 n^3 - 8516 n^2 + 79800 n - 540000)}{(n+2)^3 (n+4)(n+6)(n+8)(n+10)} \sim \frac{70080}{n (n + 41\ldots)} , $$

$$ \gamma_1(b_{23}) \sim \frac{13.76}{\sqrt{n + 40.6}} \quad \text{(approximately)} . $$

Determination of these expressions has been partly computerized, and there
is reason to hope that correctness has been achieved. The skewness measure
($\gamma_1$) of the distribution of each of the statistics $b_{21}$,
$b_{22}$, $b_{23}$ is nearly the same, when n is large. This suggests that
the shapes of the distributions may be similar.

It is possible in principle to obtain exact expressions for moments of the
statistics, supposing that $\{u_t\}$ in the definitions (3), (4), (5) are
residuals from the fitting by least squares of a linear regression
relation on given explanatory variables, the errors being independent
N(0, 1) variables (Anscombe, 1961). The exact expressions depend on the
projection matrix Q that transforms the errors into the residuals, but
under mild conditions on the explanatory variables the asymptotic
expressions quoted above, correct to a factor $1 + O(n^{-2})$, remain
valid.

For example, if only a general mean is estimated, the $\{u_t\}$ are
independent N(0, 1) variables with their average subtracted. Results for
$b_{21}$ in this case, for $n \ge 2$, are due essentially to Fisher (1930):

$$ E(b_{21}) = \frac{3(n-1)}{n+1} \sim \frac{3n}{n+2} , $$

$$ \operatorname{var}(b_{21}) = \frac{24 n (n-2)(n-3)}{(n+1)^2 (n+3)(n+5)} \sim \frac{24}{n+15} , $$

$$ E(b_{21} - E b_{21})^3 = \frac{1728\, n (n-2)(n-3)(n^2 - 5n + 2)}{(n+1)^3 (n+3)(n+5)(n+7)(n+9)} \sim \frac{1728}{n(n+37)} . $$

The asymptotic results are the same as before. Now let $\{u_t\}$ be
residuals from the fitting of a general mean and also a regression
coefficient on an explanatory variable $\{z_t\}$. Let $\{z_t\}$ be scaled
by subtracting the average and dividing by the square root of the sum of
squares (assumed to be positive). Then $\sum_t z_t = 0$,
$\sum_t z_t^2 = 1$. We require that uniformly for every t,
$z_t^2 = O(n^{-1})$; in particular, $\sum_t z_t^4 = O(n^{-1})$. This will
happen in probability if $\{z_t\}$, before the scaling just mentioned, was
a random sample from some stationary random sequence having positive
variance and all moments finite. The condition prevents the regression
coefficient on $\{z_t\}$ from being largely determined by a single
reading. We find (for $n \ge 3$)

$$ E(b_{21}) = \frac{3}{n-2} \Bigl\{ \frac{(n-1)(n-3)}{n} + \sum_t z_t^4 \Bigr\} \sim \frac{3n}{n+2} . $$

Similar exact expressions can be exhibited for other moments, though much
clumsier.
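The role of the condition on $\{z_t\}$ can be seen by evaluating the exact
mean of $b_{21}$ for particular regressors. The sketch below assumes the
exact expression has the form
$\frac{3}{n-2}\{(n-1)(n-3)/n + \sum_t z_t^4\}$, which is what follows from
the diagonal of the projection matrix for a mean plus one scaled
regressor; the function name is hypothetical.

```python
def mean_b21_with_regressor(z):
    """Exact E(b21) for residuals from fitting a mean and one regressor z."""
    n = len(z)
    zbar = sum(z) / n
    c = [v - zbar for v in z]
    ss = sum(v * v for v in c)
    zs = [v / ss ** 0.5 for v in c]       # scaled: sum zs = 0, sum zs^2 = 1
    sum_z4 = sum(v ** 4 for v in zs)
    return 3.0 / (n - 2) * ((n - 1) * (n - 3) / n + sum_z4)

n = 50
trend = [float(t) for t in range(1, n + 1)]   # well-spread regressor
spike = [0.0] * (n - 1) + [1.0]               # one reading dominates
print("asymptotic 3n/(n+2):", 3.0 * n / (n + 2))
print("linear trend       :", mean_b21_with_regressor(trend))
print("single spike       :", mean_b21_with_regressor(spike))
```

With the trend regressor the exact mean stays close to 3n/(n + 2); with
the spike regressor, where a single reading largely determines the
coefficient, $\sum_t z_t^4$ is near 1 and the agreement is visibly worse.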

In  fact,  our  time-series  innovations  {ut}  defined  at  (2)  above  are 
residuals  from  the  fitting  by  least  squares,  not  of  a  linear  regression 
relation  on  given  explanatory  variables,  but  of  a  linear  autoregression 
relation.  We  conjecture,  but  have  not  proved,  that  the  same  asymptotic 
results  hold. 

Below is given an APL program for making the $b_{21}$, $b_{22}$ and
$b_{23}$ tests on a given vector of innovations. Equivalent normal
deviates (E.N.D.) are calculated from the conjectured asymptotic moments,
the Type V approximation and the Wilson-Hilferty approximation to the
distribution of $\chi^2$.


      ∇ TSNT U;B;D;E;M;N;S
[1]   →2+3=⎕NC 'END'
[2]   →0,⍴⎕←'COPY END FROM 1234 ASP2'
[3]   →4+(∧/,0<D←(+/S←U×U)*2)∧(1<N←+/⍴U)∧1=⍴⍴U
[4]   →0,⍴⎕←'NO GO.'
[5]   'B21 = ',⍕B←(N×S+.×S)÷D
[6]   'APPROXIMATE MEAN = ',(⍕M←3÷1+2÷N),', S.E. = ',(⍕E←(24÷N+15)*÷2),', GAMMA1 = ',⍕G←14.7÷(N+29)*÷2
[7]   '   (E.N.D. = ',(2⍕END),')'
[8]   'B22 = ',⍕B←(N×+/((1↓S)+¯1↓S)*2)÷D
[9]   'APPROXIMATE MEAN = ',(⍕M←8÷1+3÷N),', S.E. = ',(⍕E←(112÷N+15.7)*÷2),', GAMMA1 = ',⍕G←14.04÷(N+31.9)*÷2
[10]  '   (E.N.D. = ',(2⍕END),')'
[11]  'B23 = ',⍕B←(N×+/((2↓S)+(1↓¯1↓S)+¯2↓S)*2)÷D
[12]  'APPROXIMATE MEAN = ',(⍕M←15÷1+4÷N),', S.E. = ',(⍕E←(296÷N+14.1)*÷2),', GAMMA1 = ',⍕G←13.76÷(N+40.6)*÷2
[13]  '   (E.N.D. = ',(2⍕END),')'
[14]  ⍝ TIME SERIES NORMALITY TEST.  THE ARGUMENT IS A VECTOR OF INNOVATIONS.
      ∇

      ∇ X←END;A
[1]   A←6+4×A×A+4○A←2÷G
[2]   →3+0<X←1+(B-M)÷E÷(2÷A-4)*÷2
[3]   →0,⍴⎕←'E.N.D. NOT FOUND.',X←''
[4]   X←(1-(2÷9×A)+((1-2÷A)÷X)*÷3)÷(2÷9×A)*÷2
[5]   ⍝ INVOKED BY 'TSNT'.
      ∇
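For readers without an APL system, the same calculation can be sketched in
Python. This is a transcription under assumptions, not the authors'
program: the function names are invented here, the constants are the
conjectured asymptotic moments quoted in the text, and the E.N.D. step is
the usual Type V fit to three moments followed by the Wilson-Hilferty cube
root transformation. When the observed statistic falls outside the support
of the fitted Type V distribution, no deviate is returned.

```python
import math

def equivalent_normal_deviate(b, mean, se, gamma1):
    """E.N.D. of b from a Type V fit to (mean, se, gamma1); None if not found."""
    a = 2.0 / gamma1
    a = 6.0 + 4.0 * a * (a + math.sqrt(1.0 + a * a))
    x = 1.0 + (b - mean) / se * math.sqrt(2.0 / (a - 4.0))
    if x <= 0.0:
        return None                       # b lies below the fitted support
    cube_root = ((1.0 - 2.0 / a) / x) ** (1.0 / 3.0)
    return (1.0 - 2.0 / (9.0 * a) - cube_root) / math.sqrt(2.0 / (9.0 * a))

def normality_tests(u):
    """Return {name: (statistic, mean, s.e., gamma1, E.N.D.)} for b21, b22, b23."""
    n = len(u)
    s = [v * v for v in u]
    d = sum(s) ** 2
    if n < 2 or d <= 0.0:
        raise ValueError("need at least two innovations, not all zero")
    stats = {
        "b21": n * sum(v * v for v in s) / d,
        "b22": n * sum((s[t] + s[t + 1]) ** 2 for t in range(n - 1)) / d,
        "b23": n * sum((s[t] + s[t + 1] + s[t + 2]) ** 2 for t in range(n - 2)) / d,
    }
    # Conjectured asymptotic mean, standard error and skewness of each statistic.
    moments = {
        "b21": (3 / (1 + 2 / n), math.sqrt(24 / (n + 15)), 14.7 / math.sqrt(n + 29)),
        "b22": (8 / (1 + 3 / n), math.sqrt(112 / (n + 15.7)), 14.04 / math.sqrt(n + 31.9)),
        "b23": (15 / (1 + 4 / n), math.sqrt(296 / (n + 14.1)), 13.76 / math.sqrt(n + 40.6)),
    }
    return {k: (stats[k],) + moments[k]
               + (equivalent_normal_deviate(stats[k], *moments[k]),)
            for k in stats}
```

A large positive E.N.D. points to heavier tails than joint normality would
give; a value returned as `None` corresponds to the "E.N.D. NOT FOUND."
branch of the APL function.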


4.  Power  considerations 


The tests are intended to be responsive to nonnormality in the joint
distribution of the random sequence. They should preferably respond little
to specification error in a jointly normal random process, that is, to
choosing too low a value for the order p of the autoregressive structure
fitted.

Suppose that p is chosen to be 1. For present purposes the mean of the
sequence may be set equal to 0. Then, if p is correct, the null hypothesis
is that $\{\xi_t\}$ is a jointly normal stationary Markov sequence:

Hypothesis A: $\xi_t = \rho\, \xi_{t-1} + \epsilon_t$, where $\rho$ is
constant ($|\rho| < 1$) and $\epsilon_t$ is distributed
$N(0, 1 - \rho^2)$ independently of $\xi_{t-1}, \xi_{t-2}, \ldots$

An alternative hypothesis involving marginal normality but not joint
normality is that $\{\xi_t\}$ is a stationary Markov "jump" sequence:

Hypothesis B: With probability $\rho/\alpha$,
$\xi_t = \alpha\, \xi_{t-1} + \epsilon_t$, where $\alpha$ and $\rho$ are
constant ($0 < \rho < \alpha \le 1$) and $\epsilon_t$ is distributed
$N(0, 1 - \alpha^2)$ independently of $\xi_{t-1}, \xi_{t-2}, \ldots$; and
with probability $1 - \rho/\alpha$, $\xi_t = \epsilon_t^*$, distributed
N(0, 1) independently of $\xi_{t-1}, \xi_{t-2}, \ldots$

When $\alpha = 1$, a realization of this sequence is quite unlike a
realization of Hypothesis A with the same value for $\rho$. But when
$\alpha$ is close to $\rho$, realizations differ in appearance only
subtly: with Hypothesis B occasional large jumps are more frequent than
with Hypothesis A. Something like this kind of joint nonnormality is
sometimes observed in practice.
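A realization of the jump sequence is easy to generate, which makes the
comparison with Hypothesis A concrete. The following sketch is an
illustration only; the function names, the seed and the parameter values
are assumptions. The simulated series should have variance near 1 and
lag-1 serial correlation near $\rho$, whatever the value of $\alpha$.

```python
import math
import random

def simulate_jump_sequence(n, rho, alpha, rng):
    """Generate n values of the stationary Markov "jump" sequence (Hypothesis B)."""
    q = rho / alpha                       # probability of continuing from the past
    sd = math.sqrt(1.0 - alpha * alpha)   # error s.d. when continuing
    xi = rng.gauss(0.0, 1.0)              # start from the N(0, 1) marginal
    out = []
    for _ in range(n):
        if rng.random() < q:
            xi = alpha * xi + rng.gauss(0.0, sd)
        else:
            xi = rng.gauss(0.0, 1.0)      # fresh independent value: a "jump"
        out.append(xi)
    return out

def lag1_correlation(x):
    """Lag-1 serial correlation coefficient about the sample mean."""
    m = sum(x) / len(x)
    c0 = sum((v - m) ** 2 for v in x)
    c1 = sum((x[t] - m) * (x[t + 1] - m) for t in range(len(x) - 1))
    return c1 / c0

rng = random.Random(3)                    # arbitrary seed (assumption)
xs = simulate_jump_sequence(20000, rho=0.5, alpha=0.9, rng=rng)
variance = sum(v * v for v in xs) / len(xs)
print("variance", variance, "lag-1 correlation", lag1_correlation(xs))
```

Second-moment summaries of such a series look the same as under
Hypothesis A with the same $\rho$, which is why a correlogram does not
reveal the difference.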

An alternative hypothesis involving joint normality but incorrect
specification is

Hypothesis C: $\{\xi_t\}$ is a jointly normal stationary autoregressive
sequence of order greater than 1, or a jointly normal stationary
moving-average sequence.


In each case, if $\rho$ is the lag-1 serial correlation coefficient, let

$$ \eta_t = \xi_t - \rho\, \xi_{t-1} . $$

Then as $n \to \infty$ the kurtosis statistics converge in probability:

$$ b_{21} \to \frac{E(\eta_t^4)}{(E\eta_t^2)^2} , \qquad b_{22} \to \frac{E(\eta_t^2 + \eta_{t+1}^2)^2}{(E\eta_t^2)^2} , \qquad b_{23} \to \frac{E(\eta_t^2 + \eta_{t+1}^2 + \eta_{t+2}^2)^2}{(E\eta_t^2)^2} . $$
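Under Hypothesis A these limits are 3, 8 and 15, respectively.
Under Hypothesis B,

$$ \frac{E(\eta_t^4)}{(E\eta_t^2)^2} = 3 + \frac{12 \rho^3 (\alpha - \rho)}{(1 - \rho^2)^2} , $$

$$ \frac{E(\eta_t^2 + \eta_{t+1}^2)^2}{(E\eta_t^2)^2} = 8 + \frac{4 \rho (\alpha - \rho)(1 + 4\rho^2 + \alpha \rho^3)}{(1 - \rho^2)^2} , $$

$$ \frac{E(\eta_t^2 + \eta_{t+1}^2 + \eta_{t+2}^2)^2}{(E\eta_t^2)^2} = 15 + \frac{4 \rho (\alpha - \rho)(2 + \alpha \rho + 5 \rho^2 + \alpha^2 \rho^4)}{(1 - \rho^2)^2} . $$

(The formidable polynomial manipulations have been computerized, and the
above results are believed to be correct.)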
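The first of these limits can be checked by hand, as a sketch of the kind
of calculation involved. Under Hypothesis B, $\eta_t$ is a mixture of two
zero-mean normal variables. With probability $\rho/\alpha$,
$\eta_t = (\alpha - \rho)\xi_{t-1} + \epsilon_t$, with variance
$(\alpha - \rho)^2 + 1 - \alpha^2 = 1 - 2\alpha\rho + \rho^2$; with
probability $1 - \rho/\alpha$, $\eta_t = \epsilon_t^* - \rho\,\xi_{t-1}$,
with variance $1 + \rho^2$. Hence

$$ E\eta_t^2 = \frac{\rho}{\alpha}(1 - 2\alpha\rho + \rho^2) + \Bigl(1 - \frac{\rho}{\alpha}\Bigr)(1 + \rho^2) = 1 - \rho^2 , $$

$$ E\eta_t^4 = 3 \Bigl[ \frac{\rho}{\alpha}(1 - 2\alpha\rho + \rho^2)^2 + \Bigl(1 - \frac{\rho}{\alpha}\Bigr)(1 + \rho^2)^2 \Bigr] = 3 \bigl[ (1 - \rho^2)^2 + 4\rho^3(\alpha - \rho) \bigr] , $$

and the ratio $E\eta_t^4 / (E\eta_t^2)^2$ is
$3 + 12\rho^3(\alpha - \rho)/(1 - \rho^2)^2$, in agreement with the
expression stated for $b_{21}$.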

For an asymptotic measure of power when n is large, the excesses of these
expressions over the null-hypothesis values of 3, 8, 15 may be divided by
the (conjectured) asymptotic standard deviations, $\sqrt{24/n}$,
$\sqrt{112/n}$, $\sqrt{296/n}$, respectively. Suppose that $\alpha > 0.9$
(say). Then $b_{22}$ is more powerful than $b_{21}$ when $\rho$ is less
than about 0.75 (the critical value varies a little with $\alpha$);
$b_{22}$ is much more powerful when $\rho$ is near 0; it is a little less
powerful when $\rho$ exceeds the critical value. And $b_{23}$ is more
powerful than $b_{22}$ when $\rho$ is less than about 0.7.
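The crossover values quoted here can be reproduced from the limits under
Hypothesis B. The sketch below is illustrative: the function names are
assumptions, the power measure is the excess divided by $\sqrt{24}$,
$\sqrt{112}$ or $\sqrt{296}$ (the common factor $\sqrt{n}$ is dropped),
and the crossover is located by bisection on the ratio of two measures.

```python
import math

def excesses(rho, alpha):
    """Excess of the limits of b21, b22, b23 over 3, 8, 15 under Hypothesis B."""
    c = rho * (alpha - rho) / (1.0 - rho * rho) ** 2
    e21 = 12.0 * rho ** 2 * c
    e22 = 4.0 * c * (1.0 + 4.0 * rho ** 2 + alpha * rho ** 3)
    e23 = 4.0 * c * (2.0 + alpha * rho + 5.0 * rho ** 2 + alpha ** 2 * rho ** 4)
    return e21, e22, e23

def power_measures(rho, alpha):
    """Excess over asymptotic s.d., with the common factor sqrt(n) omitted."""
    e21, e22, e23 = excesses(rho, alpha)
    return e21 / math.sqrt(24.0), e22 / math.sqrt(112.0), e23 / math.sqrt(296.0)

def crossover(alpha, low_index, high_index, lo=0.05, tol=1e-9):
    """Value of rho at which measure[high_index] stops exceeding measure[low_index].

    Indices 0, 1, 2 refer to b21, b22, b23.
    """
    def gap(rho):
        m = power_measures(rho, alpha)
        return m[high_index] / m[low_index] - 1.0
    hi = alpha - 1e-6
    if not (gap(lo) > 0.0 > gap(hi)):
        raise ValueError("no sign change on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for alpha in (0.9, 1.0):
    print(alpha, crossover(alpha, 0, 1), crossover(alpha, 1, 2))
```

For each $\alpha$ the first printed crossover is the value of $\rho$ below
which $b_{22}$ has the larger measure than $b_{21}$, and the second is the
corresponding value for $b_{23}$ against $b_{22}$.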

Under Hypothesis C, $\eta_t, \eta_{t+1}, \eta_{t+2}$ are jointly normally
distributed with, in general, nonzero correlation. Let the lag-h serial
correlation coefficient be $\delta_h$. Then

$$ \frac{E(\eta_t^4)}{(E\eta_t^2)^2} = 3 , \qquad \frac{E(\eta_t^2 + \eta_{t+1}^2)^2}{(E\eta_t^2)^2} = 8 + 4\delta_1^2 , $$

$$ \frac{E(\eta_t^2 + \eta_{t+1}^2 + \eta_{t+2}^2)^2}{(E\eta_t^2)^2} = 15 + 8\delta_1^2 + 4\delta_2^2 . $$

The sampling distributions for the kurtosis statistics will be affected by
the lack of independence of $\{\eta_t\}$, but the limiting value for
$b_{21}$ is not affected by the specification error. The limiting values
for $b_{22}$ and $b_{23}$ are little affected if their excesses over the
null-hypothesis values, namely $4\delta_1^2$ and
$8\delta_1^2 + 4\delta_2^2$, are small compared with the respective
standard deviations. The values of $\delta_1$ and $\delta_2$ can be
estimated from the innovations $\{u_t\}$ as their lag-1 and lag-2 serial
correlation coefficients.
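This check for specification error can be sketched directly. The code
below estimates $\delta_1$ and $\delta_2$ from a vector of innovations and
expresses the excesses $4\delta_1^2$ and $8\delta_1^2 + 4\delta_2^2$ in
units of the conjectured asymptotic standard deviations $\sqrt{112/n}$ and
$\sqrt{296/n}$. The function names and the returned dictionary layout are
assumptions made for illustration.

```python
import math

def serial_correlation(u, h):
    """Lag-h serial correlation coefficient of u about its mean."""
    n = len(u)
    m = sum(u) / n
    c0 = sum((v - m) ** 2 for v in u)
    ch = sum((u[t] - m) * (u[t + h] - m) for t in range(n - h))
    return ch / c0

def specification_excess(u):
    """Excesses in the limits of b22 and b23, relative to their asymptotic s.d."""
    n = len(u)
    d1 = serial_correlation(u, 1)
    d2 = serial_correlation(u, 2)
    excess22 = 4.0 * d1 ** 2
    excess23 = 8.0 * d1 ** 2 + 4.0 * d2 ** 2
    return {
        "delta1": d1,
        "delta2": d2,
        "b22": excess22 / math.sqrt(112.0 / n),
        "b23": excess23 / math.sqrt(296.0 / n),
    }

# Usage on a deliberately extreme, strongly alternating vector.
print(specification_excess([1.0, -1.0] * 50))
```

Ratios well below 1 suggest that residual serial correlation in the
innovations has little effect on the limiting values of $b_{22}$ and
$b_{23}$; ratios of order 1 or more suggest that a larger p should be
tried before the tests are interpreted.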